fix: use max_output_tokens when available in LiteLLM fetcher #8455
Description
This PR fixes an issue where the LiteLLM fetcher incorrectly used `max_tokens` instead of `max_output_tokens` for the `maxTokens` field, causing errors with Claude Sonnet 4.5 via Google Vertex.
Problem
When using Claude Sonnet 4.5 via Google Vertex through LiteLLM, requests were failing with a token limit error.
The cause was that the code read `max_tokens` (which can be 200k, reflecting the context size) instead of `max_output_tokens` (which is limited to 64k for output).
Solution
Modified the LiteLLM fetcher to prefer `max_output_tokens` when available, falling back to `max_tokens` for backward compatibility:
Before: `maxTokens: modelInfo.max_tokens || 8192`
After: `maxTokens: modelInfo.max_output_tokens || modelInfo.max_tokens || 8192`
Testing
Added test coverage to verify:
- `max_output_tokens` is preferred when both fields are present
- The fetcher falls back to `max_tokens` when `max_output_tokens` is not available
All existing tests continue to pass.
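For reviewers who want to see the fallback chain in isolation, here is a minimal sketch of the behavior described above. The `LiteLLMModelInfo` interface, the `resolveMaxTokens` helper, and the `DEFAULT_MAX_TOKENS` constant are hypothetical names introduced only for illustration; the actual change is the one-line expression in the fetcher, and the real model info object carries more fields than shown here.

```typescript
// Hypothetical, trimmed-down shape of a LiteLLM model info entry.
// Only the two fields relevant to this PR are modeled.
interface LiteLLMModelInfo {
	max_tokens?: number
	max_output_tokens?: number
}

// Default taken from the expression in this PR.
const DEFAULT_MAX_TOKENS = 8192

// Hypothetical helper wrapping the expression from the PR:
// prefer the output limit, then the legacy field, then the default.
function resolveMaxTokens(modelInfo: LiteLLMModelInfo): number {
	return modelInfo.max_output_tokens || modelInfo.max_tokens || DEFAULT_MAX_TOKENS
}

// Both fields present: the output limit wins.
console.log(resolveMaxTokens({ max_tokens: 200000, max_output_tokens: 64000 })) // 64000

// Only the legacy field present: fall back to it.
console.log(resolveMaxTokens({ max_tokens: 4096 })) // 4096

// Neither field present: use the default.
console.log(resolveMaxTokens({})) // 8192
```

Because the expression uses `||` rather than `??`, a reported value of `0` is treated the same as a missing field and falls through to the next candidate.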
Fixes #8454
Important
Fixes the LiteLLM fetcher to prefer `max_output_tokens` over `max_tokens`, resolving token limit errors with Claude Sonnet 4.5.
- Uses `max_output_tokens` instead of `max_tokens` for the `maxTokens` field in `litellm.ts`.
- Falls back to `max_tokens` if `max_output_tokens` is unavailable.
- Adds tests in `litellm.spec.ts` to verify the preference for `max_output_tokens` and the fallback behavior.
This description was created for a113acc.